1 Personalized spiders for web search and analysis
Michael Chau, Daniel Zeng, Hsinchun Chen

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital 
libraries 

Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index terms

Searching for useful information on the World Wide Web has become increasingly
difficult. While Internet search engines have been helping people to search on the web,
low recall rate and outdated indexes have become more and more problematic as the web
grows. In addition, search tools usually present to the user only a list of search results,
failing to provide further personalized analysis which could help users identify useful
information and comprehend these results. To alleviate these ...

Keywords: information retrieval, internet searching and browsing, internet spider,
noun-phrasing, personalization, self-organizing map



2 Visual information retrieval from large distributed online repositories
Shih-Fu Chang, John R. Smith, Mandis Beigi, Ana Benitez 
December 1997 Communications of the ACM, volume 40 issue 12 

Publisher: ACM Press 

Full text available: pdf(1.96 MB) Additional Information: full citation, references, citings, index terms




3 STARTS: Stanford proposal for Internet meta-searching
Luis Gravano, Chen-Chuan K. Chang, Hector Garcia-Molina, Andreas Paepcke
June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data SIGMOD '97, volume 26 issue 2
Publisher: ACM Press 

Full text available: pdf(1.53 MB) Additional Information: full citation, abstract, references, citings, index terms

Document sources are available everywhere, both within the internal networks of 
organizations and on the Internet. Even individual organizations use search engines from 
different vendors to index their internal document collections. These search engines are 
typically incompatible in that they support different query models and interfaces, they do
not return enough information with the query results for adequate merging of the results, 
and finally, in that they do not export metadata about t ... 

4 Link-based ranking 2: Searching the workplace web
Ronald Fagin, Ravi Kumar, Kevin S. McCurley, Jasmine Novak, D. Sivakumar, John A. Tomlin, 
David P. Williamson 

May 2003 Proceedings of the 12th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf(231.55 KB) Additional Information: full citation, abstract, references, citings, index terms

The social impact from the World Wide Web cannot be underestimated, but technologies 
used to build the Web are also revolutionizing the sharing of business and government 
information within intranets. In many ways the lessons learned from the Internet carry 
over directly to intranets, but others do not apply. In particular, the social forces that 
guide the development of intranets are quite different, and the determination of a "good 
answer" for intranet search is quite different than on the Int ... 

5 Cyberspace 2000: dealing with information overload
Hal Berghel 

February 1997 Communications of the ACM, volume 40 issue 2 
Publisher: ACM Press 

Full text available: pdf(343.30 KB) Additional Information: full citation, references, citings, index terms



6 Extracting classification knowledge of Internet documents with mining term
associations: a semantic approach
Shian-Hua Lin, Chi-Sheng Shih, Meng Chang Chen, Jan-Ming Ho, Ming-Tat Ko, Yueh-Ming
Huang

August 1998 Proceedings of the 21st annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(1.02 MB) Additional Information: full citation, references, citings, index terms



7 Short Papers: Automatically indexing documents: content vs. reference
Shannon Bradshaw, Kristian Hammond 

January 2002 Proceedings of the 7th international conference on Intelligent user
interfaces
Publisher: ACM Press 

Full text available: pdf(106.80 KB) Additional Information: full citation, abstract, references, index terms

Authors cite other work in many types of documents. Notable among these are research 
papers and web pages. Recently, several researchers have proposed using the text 
surrounding citations (references) as a means of automatically indexing documents for 
search engines, claiming that this technique is superior to indexing documents based on 
their content [1,2]. While we ourselves have made this claim, we acknowledge that little 
empirical data has been presented to support it. Therefore, in the limi ... 

Keywords: indexing precision, reference-based indexing, term diversity



H. Gregory Silber, Kathleen F. McCoy

January 2000 Proceedings of the 5th international conference on Intelligent user
interfaces
Publisher: ACM Press 

Full text available: pdf(586.39 KB) Additional Information: full citation, abstract, references, citings, index terms

The rapid growth of the Internet has resulted in enormous amounts of information that 
has become more difficult to access efficiently. Internet users require tools to help 
manage this vast quantity of information. The primary goal of this research is to create an
efficient and effective tool that is able to summarize large documents quickly. This 
research presents a linear time algorithm for calculating lexical chains which is a method 
of capturing the "aboutness" of a document. ... 

Keywords: NLP, algorithm, cohesion, lexical chains, linguistics, summarization 



9 Living Web: supporting Internet-based user-centered design
Jeffrey D. Smith, Kenji Takahashi, Eugene Liang 
April 1999 ACM SIGGROUP Bulletin, volume 20 issue 1 

Publisher: ACM Press 

Full text available: pdf(624.17 KB) Additional Information: full citation, abstract, index terms

In this paper, we describe an Internet-based platform and applications which address 
problems encountered in user-centered design. The issues of coordination and 
management, variety of representations, and the iterative nature of the design process 
are discussed along with solutions provided by our approach. We give actual examples of 
usage of our system and some issues for future consideration. 

Keywords: HCI, WWW, artifacts, collaboration, multimedia 



10 Knowledge sharing, quality, and intermediation
Claire Vishik, Andrew B. Whinston 

March 1999 ACM SIGSOFT Software Engineering Notes, Proceedings of the
international joint conference on Work activities coordination and
collaboration WACC '99, volume 24 issue 2
Publisher: ACM Press 

Full text available: pdf(1.33 MB) Additional Information: full citation, abstract, references, index terms

Informal publishing flourished in the World Wide Web environment, where every user with 
a sufficient level of access can become a publisher. Although it appears that in such an 
environment intermediation in the distribution and sharing of information becomes 
unnecessary, the uneven quality of information and resulting quality uncertainty of 
information users, together with the increased search efforts, represent a sufficient 
reason for information and knowledge intermediaries to preserve and eve ... 

Keywords: Internet, World Wide Web, economics of information, information exchange, 
intermediation, knowledge management 

11 Novel search environments: Comparison of two approaches to building a vertical
search tool: a case study in the nanotechnology domain
Michael Chau, Hsinchun Chen, Jialun Qin, Yilu Zhou, Yi Qin, Wai-Ki Sung, Daniel McDonald
July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(859.29 KB) Additional Information: full citation, abstract, references, citings, index terms

As the Web has been growing exponentially, it has become increasingly difficult to search 
for desired information. In recent years, many domain-specific (vertical) search tools 
have been developed to serve the information needs of specific fields. This paper 
describes two approaches to building a domain-specific search tool. We report our 
experience in building two different tools in the nanotechnology domain: (1) a server-side
search engine, and (2) a client-side search agent. The designs of ...

Keywords: indexing, information retrieval, internet searching and browsing, internet 
spider, noun-phrasing, personalization, post-retrieval analysis, self-organizing map,
summarization, vertical search engine, web search engine 



12 Web annotator 
Dale Reed, Sam John 

January 2003 ACM SIGCSE Bulletin, Proceedings of the 34th SIGCSE technical
symposium on Computer science education SIGCSE '03, volume 35 issue 1
Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index terms

The World Wide Web is increasingly becoming an integrated extension of users' 
computing environments, with content indexed and retrieved through Web browsers. Web 
browsers are increasingly being used as computer science curriculum delivery mechanism, 
for both books delivered as local content on CD ROMs as well as server-based 
material. Traditional computer science curriculum has often been presented through static
printed media. What has been printed ahead of time in books or handouts can not be ... 

Keywords: annotation, browser plugin, collaborative design, web-based curriculum 



13 GlOSS: text-source discovery over the Internet
Luis Gravano, Hector Garcia-Molina, Anthony Tomasic 

June 1999 ACM Transactions on Database Systems (TODS), volume 24 issue 2 
Publisher: ACM Press 

Full text available: pdf(230.37 KB) Additional Information: full citation, abstract, references, citings, index terms, review

The dramatic growth of the Internet has created a new problem for users: location of the 
relevant sources of documents. This article presents a framework for (and experimentally 
analyzes a solution to) this problem, which we call the text-source discovery problem. Our 
approach consists of two phases. First, each text source exports its contents to a 
centralized service. Second, users present queries to the service, which returns an 
ordered list of promising text sources. T ... 

Keywords: Internet search and retrieval, digital libraries, distributed information 
retrieval, text databases 



14 Experiences with selecting search engines using metasearch 
Daniel Dreilinger, Adele E. Howe 

July 1997 ACM Transactions on Information Systems (TOIS), volume 15 issue 3
Publisher: ACM Press 

Full text available: pdf(428.65 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Search engines are among the most useful and high-profile resources on the Internet. 
The problem of finding information on the Internet has been replaced with the problem of
knowing where search engines are, what they are designed to retrieve, and how to use 
them. This article describes and evaluates SavvySearch, a metasearch engine designed to 
intelligently select and interface with multiple remote search engines. The primary 
metasearch issue examined is the importance of carefully selecti ... 

Keywords: WWW, information retrieval, machine learning, search engine 



15 Ontology-supported and ontology-driven conceptual navigation on the World Wide
Web
Michel Crampes, Sylvie Ranwez

May 2000 Proceedings of the eleventh ACM on Hypertext and hypermedia 

Publisher: ACM Press 

Full text available: pdf(198.01 KB) Additional Information: full citation, references, citings, index terms



Keywords: WWW, XML, adaptive hypertext, conceptual navigation, metadata, narration, 
ontology, time optimization 



16 Global digital museum: multimedia information access and creation on the Internet 
Junichi Takahashi, Takayuki Kushida, Jung-Kook Hong, Shigeharu Sugita, Yasuyuki Kurita, 
Robert Rieger, Wendy Martin, Geri Gay, John Reeve, Rowena Loverance 
May 1998 Proceedings of the third ACM conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(1.41 MB) Additional Information: full citation, references, citings, index terms



17 The effectiveness of GlOSS for the text database discovery problem
Luis Gravano, Hector Garcia-Molina, Anthony Tomasic
May 1994 ACM SIGMOD Record, Proceedings of the 1994 ACM SIGMOD international
conference on Management of data SIGMOD '94, volume 23 issue 2
Publisher: ACM Press 

Full text available: pdf(1.36 MB) Additional Information: full citation, abstract, references, citings, index terms

The popularity of on-line document databases has led to a new problem: finding which 
text databases (out of many candidate choices) are the most relevant to a user. 
Identifying the relevant databases for a given query is the text database discovery 
problem. The first part of this paper presents a practical solution based on estimating the 
result size of a query and a database. The method is termed GlOSS (Glossary of Servers
Server. The second part of t ... 

18 XRel: a path-based approach to storage and retrieval of XML documents using 
relational databases 

August 2001 ACM Transactions on Internet Technology (TOIT), volume 1 issue 1
Publisher: ACM Press 

Full text available: pdf(264.27 KB) Additional Information: full citation, abstract, references, citings, index terms, review

This article describes XRel, a novel approach for storage and retrieval of XML documents 
using relational databases. In this approach, an XML document is decomposed into nodes 
on the basis of its tree structure and stored in relational tables according to the node 
type, with path information from the root to each node. XRel enables us to store XML
documents using a fixed relational schema without any information about DTDs and also 
to utilize indices such as the B+ 

Keywords: XML query, XPath, text markup, text tagging 



19 Placing search in context: the concept revisited
Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman,
Eytan Ruppin

April 2001 Proceedings of the 10th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf(235.96 KB) Additional Information: full citation, references, citings, index terms



Keywords: context, Invisible web, search, semantic processing, statistical natural 
language processing 

20 Information storage and management in large web-based applications using XML
Manirupa Das, Pamela B. Lawhead 

June 2003 Journal of Computing Sciences in Colleges, volume 18 issue 6 
Publisher: Consortium for Computing Sciences in Colleges 

Full text available: pdf(45.55 KB) Additional Information: full citation, abstract, references, index terms

The Extensible Markup Language [XML], was intended to be a meta-language, when it 
was initially approved as a Web Standard by the World Wide Web Consortium (W3C), in 
February of 1998. Since then, it has come a very long way in applicability and popularity 
and is fast becoming the Standard for Data Interchange over the Web. XML has now 
formed the foundation for a completely new way of communicating across the Internet. 
The power of XML to be applied universally to a number of areas lies in the fa ... 

Keywords: applying XML, data and document management, information management, 
information storage, large web applications, online course delivery 